Discovering Interesting Association Rules in the Web Log Usage Data

Authors

  • Eli Cohen
  • Maja Dimitrijević
  • Zita Bošnjak
Abstract

The immense volume of web usage data stored on web servers contains potentially valuable information about the behavior of website visitors. This information can be exploited in various ways, such as enhancing the effectiveness of websites or developing targeted web marketing campaigns. This paper focuses on applying association rules as a data mining technique to extract potentially useful knowledge from web usage data. We conducted a comprehensive analysis of web usage association rules found on the website of an educational institution. Our experiments confirm that, prior to pruning, the set of generated association rules contained too many non-interesting rules, which made it very difficult for a user to find and exploit useful information. Many of these rules are a simple consequence of the high correlation between web pages that are interconnected through the website's link structure. We proposed and applied a set of basic pruning schemes to reduce the rule set size and to remove a significant number of non-interesting rules. This pruning reduced the size of our experimental rule set by a factor of more than three, making it much simpler to browse for truly interesting rules. In the resulting experimental rule set, 41% of the rules were truly interesting, meaning that they can prompt a webmaster to take actions that potentially enhance the website and improve the browsing experience. The analysis of association rules in our case study confirmed the hypothesis that discovering interesting and potentially useful association rules in web usage data does not have to be a time-consuming task and can lead to actions that increase the website's effectiveness.
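
The idea in the abstract can be illustrated with a minimal sketch. It mines single-page rules "A -> B" from visitor sessions using support and confidence, then drops rules whose two pages are directly linked, since such rules mostly restate the site's link structure. The sessions, the link map, the thresholds, and the function names below are all hypothetical, and the pruning rule shown is only one plausible basic scheme, not necessarily the one the authors used.

```python
from itertools import permutations

# Hypothetical sessions: each is the set of pages one visitor viewed.
sessions = [
    {"/home", "/courses", "/exams"},
    {"/home", "/courses"},
    {"/home", "/library", "/exams"},
    {"/courses", "/exams"},
    {"/home", "/courses", "/library"},
]

# Hypothetical link structure: page -> pages it links to directly.
links = {
    "/home": {"/courses", "/library"},
    "/courses": {"/home"},
    "/library": {"/home"},
    "/exams": set(),
}

def mine_pair_rules(sessions, min_support=0.4, min_confidence=0.6):
    """Return (a, b, support, confidence) for every rule a -> b that passes both thresholds."""
    n = len(sessions)
    pages = sorted(set().union(*sessions))
    rules = []
    for a, b in permutations(pages, 2):
        count_a = sum(1 for s in sessions if a in s)
        count_ab = sum(1 for s in sessions if a in s and b in s)
        if count_a == 0:
            continue
        support = count_ab / n            # share of sessions containing both pages
        confidence = count_ab / count_a   # share of sessions with a that also contain b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules

def prune_linked(rules, links):
    """Drop rules whose two pages are directly linked in either direction (assumed scheme)."""
    return [
        r for r in rules
        if r[1] not in links.get(r[0], set())
        and r[0] not in links.get(r[1], set())
    ]

all_rules = mine_pair_rules(sessions)
kept_rules = prune_linked(all_rules, links)

for a, b, support, confidence in kept_rules:
    print(f"{a} -> {b}  support={support:.2f}  confidence={confidence:.2f}")
```

On this toy data, five rules pass the thresholds and two survive pruning, both starting from `/exams`, a page that no other page links to. A rule of that kind might suggest adding a link, which is the sort of webmaster action the abstract has in mind.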

Similar resources

Indirect Positive and Negative Association Rules in Web Usage Mining

One of the purposes of Web usage mining is to find interesting user association rules in web server logs. It has become vital for personalization, effective web site management, business and support services, creating adaptive web sites, and so on. In the web domain, items correspond to pages and transactions to user sessions. Indirect associations, a type of infrequent pattern, provide usef...

A Survey on Approaches for Mining Frequent Itemsets

Data mining is gaining importance due to the huge amount of data available. Retrieving information from the warehouse is not only tedious but also difficult in some cases. The most important uses of data mining are customer segmentation in marketing, shopping cart analysis, customer relationship management, campaign management, Web usage mining, text mining, player tracking, and so on. In data mi...

A Negative Association Rules for Web Usage Mining Using Negative Selection Algorithm

The immense capacity of web usage data which survives on web servers contains potentially precious information about the performance of website visitors. Pattern Mining involves applying data mining methods to large web data repositories to extract usage patterns. Due to the emerging reputation of the World Wide Web, many websites classically experience thousands of visitors every day. Examinat...

An Overview of Web Usage Mining

Web Usage Mining makes use of Association Rule Mining to discover interesting patterns, identify web user behavior, predict web user expectations, and improve business strategy. Association Rule Mining is a Data Mining technique used to find relationships between data items. In Web Usage Mining, data are stored on the web server in the form of web log files. Numerous amou...

Utility Pattern Approach for Mining High Utility Log Items from Web Log Data

Mining frequent log items is an active area in data mining that aims at searching interesting relationships between items in databases. It can be used to address a wide variety of problems such as discovering association rules, sequential patterns, correlations and much more. Weblog that analyzes a Web site's access log and reports the number of visitors, views, hits, most frequently visited ...

Publication date: 2010